Recovery of Block-Sparse Representations from Noisy Observations via Orthogonal Matching Pursuit

Jun Fang, Member, IEEE, and Hongbin Li, Senior Member, IEEE 






Abstract — We study the problem of recovering the sparsity pat- 
tern of block-sparse signals from noise-corrupted measurements. 
A simple, efficient recovery method, namely, a block-version of 
the orthogonal matching pursuit (OMP) method, is considered 
in this paper and its behavior for recovering the block-sparsity 
pattern is analyzed. We provide sufficient conditions under 
which the block-version of the OMP can successfully recover 
the block-sparse representations in the presence of noise. Our 
analysis reveals that exploiting block-sparsity can improve the 
recovery ability and lead to a guaranteed recovery for a higher 
sparsity level. Numerical results are presented to corroborate our 
theoretical claim. 

Index Terms — Block-sparsity, orthogonal matching pursuit, 
compressed sensing. 

I. Introduction 

The problem of recovering a high dimensional sparse signal 
based on a small number of measurements has been of 
significant interest in signal and imaging processing, applied 
mathematics, and statistics. Such a problem arises from a 
number of applications, including subset selection in regres- 
sion [1], structure estimation in graphical models [2], and 
compressed sensing [3]. Among these applications, many
involve determining the locations of the nonzero components
of the sparse signal, which is also referred to as sparsity pattern 
recovery (or more simply, sparsity recovery). In practice, the 
locations of the nonzero components (or, the support of the 
sparse signals) usually have significant physical meanings. 
For example, in chemical agent detection, the indices of
the nonzero coordinates indicate the chemical components
present in a mixture. In sparse linear regression, the recovered 
support corresponds to a small subset of features which 
linearly influence the observed data. Due to its importance, 
sparsity pattern recovery has received considerable attention 
over the past few years. In [4], [5], the authors analyzed 
the behavior of $\ell_1$-constrained quadratic programming (QP),
also referred to as the Lasso, for recovering the sparsity 
pattern in a deterministic framework. Sufficient conditions 
were established for exact sparsity pattern recovery. Such a 
problem was also studied in [6] from a statistical perspective, 
where necessary and sufficient conditions on the problem
dimension, the number of nonzero elements, and the number
of measurements were established for sparsity pattern recovery.
Recently, information-theoretic limits of sparsity recovery with 
an exhaustive search decoder were studied in [7], [8]. 

In this paper, we consider the problem of recovering block- 
sparse signals whose nonzero elements appear in fixed blocks. 
Block-sparse signals arise naturally. For example, the atomic 
decomposition of multi-band signals [9] or audio signals [10] 
usually results in a block-sparse structure in which the nonzero 
coefficients occur in clusters. Recovery of block-sparse signals 
has been extensively studied in [11]-[13], in which the recovery
behaviors of the basis pursuit (BP), or $\ell_1$-constrained
QP, and the orthogonal matching pursuit (OMP) algorithms 
were analyzed via the restricted isometry property (RIP) [12], 
[13] and the mutual coherence property [11]. Their analyses 
[11]-[13] revealed that exploiting block-sparsity yields a re- 
laxed condition which can guarantee recovery for a higher 
sparsity level as compared with treating block-sparse signals 
as conventional sparse signals. Nevertheless, most of these 
studies focused on noiseless scenarios. In practice, measure- 
ments are inevitably contaminated with noise and underlying 
uncertainties. It is therefore important to analyze the effect 
of measurement noise on the block-sparsity pattern recovery, 
e.g. under what conditions the exact sparsity pattern can 
be recovered, and does exploiting block-sparsity still lead 
to a guaranteed recovery for a higher sparsity level? These 
questions will be addressed in this paper. Specifically, we 
consider a block version of the OMP algorithm and study its
behavior for recovering the block-sparsity pattern in the presence
of noise. A comparison with the theoretical results for the
conventional OMP algorithm [5] is presented to highlight the
benefits of exploiting the block-sparsity property.

II. Problem Formulation 

We consider the problem of recovering a block-sparse signal
$x \in \mathbb{R}^n$ from noise-corrupted measurements

$$y = Ax + w \tag{1}$$

where $A \in \mathbb{R}^{m \times n}$ ($m < n$) is the measurement matrix with
unit-norm columns, and w is an arbitrary and unknown vector 
of errors. To define block-sparsity, as in [11], we model x as 
a concatenation of equal-length blocks

$$x = [x_1^T \; x_2^T \; \cdots \; x_L^T]^T \tag{2}$$

where $x_l = [x_{(l-1)d+1} \; \cdots \; x_{ld}]^T$ is a $d$-dimensional vector.
Clearly, the vector $x$ has dimension $n = Ld$, and the
vector is called block $K$-sparse if its block component $x_l$ has
nonzero Euclidean norm for at most $K$ indices $l$. Similarly, the
measurement matrix $A$ can be expressed as a concatenation
of column-block matrices $\{A_l\}_{l=1}^{L}$



$$A = [A_1 \; A_2 \; \cdots \; A_L] \tag{3}$$

where $A_l \in \mathbb{R}^{m \times d}$. Also, we assume that the number of
rows of $A$ is an integer multiple of $d$, i.e. $m = Rd$
with $R$ an integer. The conventional coherence metric of the
measurement matrix $A$ is defined as



$$\mu = \max_{i \neq j} |a_i^T a_j| \tag{4}$$

where $a_i$ denotes the $i$th column of $A$. This coherence
metric, albeit useful, is not sufficient to characterize the block
structure of the sparse signal. To exploit the block-sparsity
property, we define the block-coherence $\mu_B$ and sub-coherence
$\nu$ (these two concepts were first introduced in [11]):



$$\mu_B = \max_{i \neq j} \frac{1}{d} \rho(A_i^T A_j), \qquad \nu = \max_{l} \max_{i \neq j} |a_i^T a_j|, \quad a_i, a_j \in A_l \tag{5}$$

where $\rho(X)$ denotes the spectral norm of $X$, which is defined
as the square root of the maximum eigenvalue of $X^T X$, i.e.
$\sqrt{\lambda_{\max}(X^T X)}$. Related properties of the block-coherence $\mu_B$
can be found in [11]. We see that $\mu_B$ quantifies the coherence
between blocks of $A$, while the coherence within blocks is
characterized by the sub-coherence $\nu$.
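As an illustration only, the three coherence quantities can be evaluated numerically for a given dictionary. The sketch below assumes NumPy, unit-norm columns, and a column count divisible by $d$; the helper name `coherence_metrics` is hypothetical and not part of the original paper.

```python
import numpy as np

def coherence_metrics(A, d):
    """Return (mu, mu_B, nu) for a matrix A whose columns have unit norm.

    Hypothetical helper: the name and signature are illustrative only.
    """
    n = A.shape[1]
    if n % d != 0:
        raise ValueError("the number of columns must be a multiple of d")
    L = n // d
    G = A.T @ A                                   # Gram matrix of the columns
    off = np.abs(G - np.diag(np.diag(G)))         # off-diagonal magnitudes
    mu = off.max()                                # conventional coherence
    mu_B, nu = 0.0, 0.0
    for i in range(L):
        bi = slice(i * d, (i + 1) * d)
        # sub-coherence: largest correlation between columns of one block
        nu = max(nu, off[bi, bi].max())
        for j in range(i + 1, L):                 # rho(G_ij) equals rho(G_ji)
            bj = slice(j * d, (j + 1) * d)
            # block-coherence: spectral norm of an off-diagonal block over d
            mu_B = max(mu_B, np.linalg.norm(G[bi, bj], 2) / d)
    return mu, mu_B, nu
```

With $d = 1$ every block is a single column, so the function returns $\nu = 0$ and $\mu_B = \mu$, which matches the reduction to conventional sparsity discussed later in the paper.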

The objective of this paper is to identify sufficient conditions 
on the measurement matrix $A$ (in terms of the block-coherence
$\mu_B$ and the sub-coherence $\nu$), as well as the signal vector
$x$ and the error vector $w$, under which the block-sparsity
pattern can be recovered from the noisy measurements. We 
are particularly interested in analyzing the recovery ability of 
a block-version of the orthogonal matching pursuit (OMP). 
OMP is a simple greedy approximation algorithm developed 
in [14], [15]. Despite its simplicity, OMP is a provably good 
approximation algorithm which achieves performance close to 
Lasso in certain scenarios [16], [17]. In the following, we 
briefly summarize the block-version of the OMP, which is also 
termed as block-OMP (BOMP). This BOMP is a slight variant 
of the original BOMP that was introduced in [11] for noiseless 
scenarios. 

BOMP Algorithm: 

1) Initialize the residual $r_0 = y$ and the index set $S_0 = \emptyset$.

2) At the $t$th step ($t \geq 1$), we choose the block that is best
matched to $r_{t-1}$ according to

$$i_t = \arg\max_i \|A_i^T r_{t-1}\|_2 \tag{6}$$

3) Augment the index set and the matrix of chosen blocks:
$S_t = S_{t-1} \cup \{i_t\}$ and $\Phi^{(t)} = [\Phi^{(t-1)} \; A_{i_t}]$. We use the
convention that $\Phi^{(0)}$ is an empty matrix.

4) Solve a least squares problem to obtain a new signal
estimate $x_t = \arg\min_x \|y - \Phi^{(t)} x\|_2$.

5) Calculate the new residual as $r_t = y - \Phi^{(t)} x_t = y - P_{\Phi^{(t)}} y$,
where $P_{\Phi^{(t)}} = \Phi^{(t)} (\Phi^{(t)})^{\dagger}$ is the orthogonal
projection onto the column space of $\Phi^{(t)}$, and $\dagger$ stands
for the pseudo-inverse.

6) If $\|r_t\|_2 > \epsilon$, return to Step 2; otherwise stop.
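The six steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function name `bomp`, the `max_blocks` cap, and the explicit guard against re-selecting a block are additions made here for numerical robustness.

```python
import numpy as np

def bomp(A, y, d, eps, max_blocks=None):
    """Block-OMP sketch. Returns (list of selected block indices, estimate of x)."""
    m, n = A.shape
    L = n // d
    if max_blocks is None:
        # assumption: never select more columns than there are measurements
        max_blocks = min(L, m // d)
    y = np.asarray(y, dtype=float)
    r = y.copy()                      # Step 1: r_0 = y
    S = []                            # Step 1: S_0 is empty
    x_hat = np.zeros(n)
    while len(S) < max_blocks:
        # Step 2: correlation of every block with the current residual
        corr = np.linalg.norm((A.T @ r).reshape(L, d), axis=1)
        if S:
            corr[S] = -np.inf         # guard against round-off re-selection
        S.append(int(np.argmax(corr)))            # Step 3
        cols = np.concatenate([np.arange(i * d, (i + 1) * d) for i in S])
        Phi = A[:, cols]
        coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)   # Step 4
        r = y - Phi @ coef                        # Step 5
        x_hat = np.zeros(n)
        x_hat[cols] = coef
        if np.linalg.norm(r) <= eps:              # Step 6
            break
    return S, x_hat
```

Setting `d = 1` turns the same routine into conventional OMP, which is convenient when comparing the two algorithms on the same data.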

III. Block-Sparsity Pattern Recovery Analysis 

Let $x_{nz}$ denote a $Kd$-dimensional column vector constructed
by stacking the nonzero block components $x_l, \forall \{l \mid x_l \neq 0\}$,
let $A_{nz} \in \mathbb{R}^{m \times Kd}$ denote a submatrix of $A$ constructed by
concatenating the column-blocks $A_l, \forall \{l \mid x_l \neq 0\}$, i.e. the blocks
corresponding to the nonzero $x_l$, and let $A_z \in \mathbb{R}^{m \times (L-K)d}$
stand for a submatrix of $A$ constructed by concatenating the
column-blocks $A_l$ corresponding to zero $x_l$. For notational
convenience, let $I_1 = \{l_1, l_2, \ldots, l_K\}$ denote the set of indices
for which $x_{l_k} \neq 0$, and $I_2 = \{l_{K+1}, l_{K+2}, \ldots, l_L\}$ denote the
set of indices for which $x_{l_k} = 0$. Therefore we can write

$$x_{nz} = [x_{l_1}^T \; x_{l_2}^T \; \cdots \; x_{l_K}^T]^T$$
$$A_{nz} = [A_{l_1} \; A_{l_2} \; \cdots \; A_{l_K}]$$
$$A_z = [A_{l_{K+1}} \; A_{l_{K+2}} \; \cdots \; A_{l_L}]$$

The measurements can therefore be written as 

$$y = A_{nz} x_{nz} + w \tag{7}$$

We can decompose the error vector $w$ into $w = P_{A_{nz}} w + P_{A_{nz}}^{\perp} w$,
where $P_{A_{nz}} = A_{nz} A_{nz}^{\dagger}$ denotes the orthogonal projection
onto the subspace spanned by the columns of $A_{nz}$, and
$P_{A_{nz}}^{\perp} = I - P_{A_{nz}}$ is the orthogonal projection onto the null
space of $A_{nz}^T$. We can further write

$$\begin{aligned}
y &= A_{nz} x_{nz} + P_{A_{nz}} w + P_{A_{nz}}^{\perp} w \\
&= A_{nz} (x_{nz} + A_{nz}^{\dagger} w) + P_{A_{nz}}^{\perp} w \\
&= A_{nz} \tilde{x}_{nz} + \tilde{w}
\end{aligned} \tag{8}$$

where $\tilde{x}_{nz} = x_{nz} + A_{nz}^{\dagger} w$, and $\tilde{w} = P_{A_{nz}}^{\perp} w$. Equation (8)
decomposes the measurements into two mutually orthogonal
components: a signal component $A_{nz} \tilde{x}_{nz}$ and a noise component
$\tilde{w}$. The reason for doing so is that even if the exact signal
support (block-sparsity pattern) is known, there is no way to
separate the noise projection term $A_{nz}^{\dagger} w$ from the true signal
$x_{nz}$. Hence it is more convenient to carry out our analysis
based on (8) instead of (7).
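The decomposition of the measurements into a signal part lying in the column space of $A_{nz}$ and a noise part orthogonal to it can be checked numerically. The sketch below assumes NumPy; the helper name `split_noise`, the problem sizes, and the random seed are arbitrary illustrative choices.

```python
import numpy as np

def split_noise(A_nz, x_nz, w):
    """Return (x_tilde, w_tilde): the perturbed signal and the residual noise."""
    A_pinv = np.linalg.pinv(A_nz)
    P = A_nz @ A_pinv                 # orthogonal projection onto span(A_nz)
    x_tilde = x_nz + A_pinv @ w       # signal plus the inseparable noise part
    w_tilde = w - P @ w               # noise component orthogonal to span(A_nz)
    return x_tilde, w_tilde

# small random instance (sizes and seed are illustrative assumptions)
rng = np.random.default_rng(0)
A_nz = rng.standard_normal((20, 6))
A_nz /= np.linalg.norm(A_nz, axis=0)  # unit-norm columns
x_nz = rng.standard_normal(6)
w = 0.1 * rng.standard_normal(20)
y = A_nz @ x_nz + w

x_tilde, w_tilde = split_noise(A_nz, x_nz, w)
recon_err = np.linalg.norm(y - (A_nz @ x_tilde + w_tilde))  # should be ~0
orth_err = np.linalg.norm(A_nz.T @ w_tilde)                 # should be ~0
```

Both `recon_err` and `orth_err` should be at the level of floating-point round-off, reflecting that the two components reproduce $y$ and are mutually orthogonal.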

Recall that, at each iteration, the BOMP algorithm searches
for a block that is best matched to the residual vector according
to (6). We can define a greedy selection ratio that determines
whether or not a correct block is selected at each iteration

$$\gamma_t = \frac{\max_{i \in I_2} \|A_i^T r_{t-1}\|_2}{\max_{i \in I_1} \|A_i^T r_{t-1}\|_2} \tag{9}$$

where $r_{t-1}$ is the residual vector at iteration $t-1$. Clearly,
at each iteration, the algorithm picks an index whose corresponding
block is in $A_{nz}$ if $\gamma_t < 1$; otherwise an incorrect
index whose corresponding block is in $A_z$ is chosen. Since
the residual is orthogonal to the subspace spanned by all the
previously chosen block-columns, no index will be chosen
twice. Therefore, in order to recover the block-sparsity pattern,
we need to guarantee $\gamma_t < 1$ throughout the first $K$ iterations,
i.e. $\gamma_t < 1, \forall t \leq K$. Here for simplicity, we assume that
the number of nonzero blocks, $K$, is known a priori. In
practice, $K$ can be automatically determined by the BOMP
algorithm given the error tolerance $\epsilon$ ($\epsilon$ can be estimated from
the observation noise power in practice). As long as $K$ is not
overestimated, i.e. the estimate $\hat{K} \leq K$, we can ensure that all the chosen
indices are from the set of correct indices $I_1$.
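For a given residual and a known true support, the greedy selection ratio can be evaluated directly, which is useful for diagnosing why a particular run selects a wrong block. The following is a minimal sketch assuming NumPy; the function name `greedy_selection_ratio` and the requirement that the support be a non-empty proper subset of the blocks are assumptions made here.

```python
import numpy as np

def greedy_selection_ratio(A, r, d, support):
    """Ratio of the best incorrect-block correlation to the best correct one.

    `support` holds the indices of the truly nonzero blocks (the set I_1).
    A value below 1 means the next greedy step picks a correct block.
    """
    L = A.shape[1] // d
    support = list(support)
    if not 0 < len(support) < L:
        raise ValueError("support must be a non-empty proper subset of blocks")
    corr = np.linalg.norm((A.T @ r).reshape(L, d), axis=1)
    mask = np.zeros(L, dtype=bool)
    mask[support] = True
    # if the residual is orthogonal to all correct blocks the ratio is undefined
    return corr[~mask].max() / corr[mask].max()
```

For example, with an identity dictionary, block size 2, residual `[3, 4, 0, 1, 0, 0]` and true support `{0}`, the block correlations are 5, 1 and 0, so the ratio is 0.2 and the first block is selected.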

In the following, we derive sufficient conditions that guarantee
$\gamma_t < 1$ throughout the first $K$ iterations. Before
proceeding, we define a general mixed $\ell_2/\ell_p$-norm ($p = 1, 2, \infty$)
that will be used throughout this paper. For a vector
$z = [z_1^T \; z_2^T \; \cdots \; z_Q^T]^T$ consisting of equal-length blocks with
block size $d$, the general mixed $\ell_2/\ell_p$-norm (with block size
$d$) is defined as

$$\|z\|_{2,p} = \|v\|_p, \quad \text{where } v = [v_1 \; \cdots \; v_Q]^T \text{ and } v_q = \|z_q\|_2 \tag{10}$$

Correspondingly, for a matrix $X \in \mathbb{R}^{Ud \times Qd}$, where $U$ and
$Q$ can be any positive integers, the mixed matrix norm (with
block size $d$) is defined as

$$\|X\|_{2,p} = \max_{z \neq 0} \frac{\|Xz\|_{2,p}}{\|z\|_{2,p}} \tag{11}$$
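The vector form of the mixed norm is simple to compute: take the Euclidean norm of each length-$d$ block, then the $\ell_p$ norm of the resulting vector. A minimal NumPy sketch follows; the function name `mixed_norm` is a hypothetical choice, and only the vector norm is covered (the induced matrix norm has no comparably simple expression and is not attempted here).

```python
import numpy as np

def mixed_norm(z, d, p):
    """Mixed l2/lp norm of a vector z made of equal-length blocks of size d.

    p may be 1, 2 or np.inf.
    """
    z = np.asarray(z, dtype=float)
    if z.size % d != 0:
        raise ValueError("length of z must be a multiple of d")
    v = np.linalg.norm(z.reshape(-1, d), axis=1)   # per-block Euclidean norms
    return np.linalg.norm(v, ord=p)                # lp norm across blocks
```

For `z = [3, 4, 0, 0, 1, 0]` and `d = 2` the block norms are 5, 0 and 1, so the $p = 1$, $p = 2$ and $p = \infty$ values are 6, $\sqrt{26}$ and 5 respectively.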



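Resorting to this general mixed $\ell_2/\ell_p$-norm (with block size
$d$) definition, the greedy selection ratio defined in (9) can be
re-expressed as

$$\gamma_t = \frac{\max_{\{i: x_i = 0\}} \|A_i^T r_{t-1}\|_2}{\max_{\{i: x_i \neq 0\}} \|A_i^T r_{t-1}\|_2} = \frac{\|A_z^T r_{t-1}\|_{2,\infty}}{\|A_{nz}^T r_{t-1}\|_{2,\infty}} \tag{12}$$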



Suppose that the BOMP algorithm has successfully executed
the first $k$ ($k < K$) iterations with residual

$$r_k = y - P_{\Phi_1} y \tag{13}$$

where $\Phi_1 \in \mathbb{R}^{m \times kd}$ is a matrix constructed by concatenating
the $k$ block-columns chosen from the previous $k$ iterations,
and $P_{\Phi_1} = \Phi_1 \Phi_1^{\dagger}$ is the orthogonal projection onto the
column space of $\Phi_1$. Note that $\Phi_1$ is a sub-matrix of $A_{nz}$
since we assume that the algorithm selected the correct indices
during the first $k$ iterations. Let $\Phi_2$ be a matrix constructed
by concatenating the remaining $K - k$ column-blocks in $A_{nz}$.
Without loss of generality, we can write $A_{nz} = [\Phi_1 \; \Phi_2]$,
i.e. $\Phi_1 = [A_{l_1} \; \cdots \; A_{l_k}]$, and $\Phi_2 = [A_{l_{k+1}} \; \cdots \; A_{l_K}]$.
Also, we write $\tilde{x}_{nz} = [\tilde{x}_{l_1}^T \; \tilde{x}_{l_2}^T \; \cdots \; \tilde{x}_{l_K}^T]^T = [\phi_1^T \; \phi_2^T]^T$,
where $\phi_1 = [\tilde{x}_{l_1}^T \; \cdots \; \tilde{x}_{l_k}^T]^T$ and $\phi_2 = [\tilde{x}_{l_{k+1}}^T \; \cdots \; \tilde{x}_{l_K}^T]^T$.



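Substituting (8) into (13), the residual can be written as

$$\begin{aligned}
r_k &= A_{nz} \tilde{x}_{nz} + \tilde{w} - P_{\Phi_1} (A_{nz} \tilde{x}_{nz} + \tilde{w}) \\
&\overset{(a)}{=} A_{nz} \tilde{x}_{nz} - P_{\Phi_1} A_{nz} \tilde{x}_{nz} + \tilde{w} \\
&\overset{(b)}{=} \Phi_2 \phi_2 - P_{\Phi_1} \Phi_2 \phi_2 + \tilde{w} \\
&\overset{(c)}{=} \bar{r}_k + \tilde{w}
\end{aligned} \tag{14}$$

where (a) comes from the fact that $\tilde{w}$ is orthogonal to the
column space of $\Phi_1$, (b) comes by noting that $P_{\Phi_1} \Phi_1 = \Phi_1$,
and in (c) we define $\bar{r}_k = \Phi_2 \phi_2 - P_{\Phi_1} \Phi_2 \phi_2$. Using this
result, the greedy selection ratio at iteration $k + 1$ becomes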

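$$\begin{aligned}
\gamma_{k+1} &= \frac{\|A_z^T r_k\|_{2,\infty}}{\|A_{nz}^T r_k\|_{2,\infty}} = \frac{\|A_z^T (\bar{r}_k + \tilde{w})\|_{2,\infty}}{\|A_{nz}^T (\bar{r}_k + \tilde{w})\|_{2,\infty}} \\
&\overset{(a)}{\leq} \frac{\|A_z^T \bar{r}_k\|_{2,\infty}}{\|A_{nz}^T \bar{r}_k\|_{2,\infty}} + \frac{\|A_z^T \tilde{w}\|_{2,\infty}}{\|A_{nz}^T \bar{r}_k\|_{2,\infty}} \\
&\overset{(b)}{=} \frac{\|A_z^T P_{A_{nz}} \bar{r}_k\|_{2,\infty}}{\|A_{nz}^T \bar{r}_k\|_{2,\infty}} + \frac{\|A_z^T \tilde{w}\|_{2,\infty}}{\|A_{nz}^T \bar{r}_k\|_{2,\infty}} \\
&= \frac{\|A_z^T (A_{nz}^{\dagger})^T A_{nz}^T \bar{r}_k\|_{2,\infty}}{\|A_{nz}^T \bar{r}_k\|_{2,\infty}} + \frac{\|A_z^T \tilde{w}\|_{2,\infty}}{\|A_{nz}^T \bar{r}_k\|_{2,\infty}} \\
&\leq \|A_z^T (A_{nz}^{\dagger})^T\|_{2,\infty} + \frac{\|A_z^T \tilde{w}\|_{2,\infty}}{\|A_{nz}^T \bar{r}_k\|_{2,\infty}}
\end{aligned} \tag{15}$$

where (a) comes from the fact that the general mixed $\ell_2/\ell_p$-norm
satisfies the triangle inequality $\|a + b\|_{2,\infty} \leq \|a\|_{2,\infty} + \|b\|_{2,\infty}$,
which can be readily verified, together with $A_{nz}^T \tilde{w} = 0$
(since $\tilde{w}$ lies in the null space of $A_{nz}^T$), and (b) follows from
$P_{A_{nz}} \bar{r}_k = \bar{r}_k$ since $\bar{r}_k$ lies in the column space of $A_{nz}$. Our
objective is to identify conditions assuring $\gamma_{k+1} < 1$.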

If the measurement process is perfect and noise-free, that 
is, y = Ax, then the greedy selection ratio is simply upper 
bounded by 



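$$\gamma_{k+1} \leq \|A_z^T (A_{nz}^{\dagger})^T\|_{2,\infty} \tag{16}$$

Furthermore, it has been shown in [11, Lemma 4] that
$\|A_z^T (A_{nz}^{\dagger})^T\|_{2,\infty}$ is upper bounded by

$$\|A_z^T (A_{nz}^{\dagger})^T\|_{2,\infty} \leq \frac{K d \mu_B}{1 - (d-1)\nu - (K-1) d \mu_B} \tag{17}$$

Therefore the condition $\gamma_{k+1} < 1$ holds universally if the
block-coherence $\mu_B$ and sub-coherence $\nu$ associated with the
dictionary $A$ satisfy

$$\frac{K d \mu_B}{1 - (d-1)\nu - (K-1) d \mu_B} < 1 \tag{18}$$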



Since, in practice, measurements are inevitably contaminated 
with noise and underlying uncertainties, it is thus important 
to understand the effect of measurement noise on the block- 
sparsity pattern recovery. Apparently, when noise is present, 
condition (18) alone cannot guarantee the exact recovery of
the block-sparsity pattern. Instead, from (15), we see that, to
assure $\gamma_{k+1} < 1$, we need

$$\|A_z^T (A_{nz}^{\dagger})^T\|_{2,\infty} + \frac{\|A_z^T \tilde{w}\|_{2,\infty}}{\|A_{nz}^T \bar{r}_k\|_{2,\infty}} < 1 \tag{19}$$

The inequality (19) has to hold for $0 \leq k \leq K - 1$ in
order to ensure that the BOMP algorithm chooses the correct
indices throughout the first $K$ iterations. In the following, we
provide sufficient conditions which guarantee (19) for
$0 \leq k \leq K - 1$. The results are summarized as follows.
Theorem 1: Let

$$\omega = \|A_z^T \tilde{w}\|_{2,\infty} = \max_{l \in I_2} \|A_l^T \tilde{w}\|_2 \tag{20}$$

denote the maximum correlation between the column block
$A_l$ and the residual noise component $\tilde{w}$. Let

$$x_{b,\min} = \min_{l \in I_1} \|\tilde{x}_l\|_2 \tag{21}$$

denote the minimum $\ell_2$-norm of the nonzero signal block components.
Suppose that the following conditions are satisfied

$$\begin{aligned}
&\text{(i)} \quad 1 - (d-1)\nu - (2K-1) d \mu_B > 0 \\
&\text{(ii)} \quad \frac{[1 - (d-1)\nu - (2K-1) d \mu_B]^2}{1 - (d-1)\nu - (K-1) d \mu_B} > \frac{\omega}{x_{b,\min}}
\end{aligned} \tag{22}$$

then we can guarantee that the BOMP algorithm selects indices
from $I_1$ throughout the first $K$ iterations. If the error tolerance
$\epsilon$ is chosen such that the algorithm stops at the end of iteration
$K$, then the BOMP recovers the exact block-sparsity pattern.
Proof: See Appendix A. ■
Theorem 1 is a generalization of the results presented in [11],
which considered block-sparse signal recovery from noise-free
measurements. To see this, for the noiseless case, we have
$\omega = 0$, and hence the condition (22) is simplified as

$$1 - (d-1)\nu - (2K-1) d \mu_B > 0 \tag{23}$$

which is exactly the recovery condition provided in [11] for
block-sparse signal recovery. On the other hand, for the noisy
case, the success of the BOMP algorithm depends not only
on the block-coherence $\mu_B$ and the sub-coherence $\nu$, but also
on the ratio of the maximum correlation (between the
column block $A_l$ and the residual noise component $\tilde{w}$) to the
minimum $\ell_2$-norm of the nonzero signal block components
$\tilde{x}_l, \forall l \in I_1$. The importance of the minimum nonzero signal
component in sparsity pattern recovery has been highlighted
in [7], [8]. In particular, [7] showed that both the sufficient and
necessary conditions require control of the minimum nonzero
signal component. Our result suggests that, for block-sparse
signal recovery, the minimum $\ell_2$-norm of the nonzero signal
block components, instead of the minimum magnitude of
an entry, is the key quantity that controls the block subset
selection.

Also, we observe that the left-hand side of the second
condition in (22) is strictly less than one. Therefore the ratio
$\omega / x_{b,\min}$ cannot be greater than one; otherwise the condition
cannot be met, irrespective of the choice of the sub-coherence
$\nu$ and the block-coherence $\mu_B$. The deterministic condition
(22), however, guarantees recovery of the sparsity pattern
under the worst-case scenario and therefore is very pessimistic.
If we adopt a probabilistic analysis (as in [18]) that ensures
a probabilistic recovery, the condition can be significantly
relaxed. This could be a direction of our future study.
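For given problem parameters, the two conditions of Theorem 1 reduce to scalar inequalities that are easy to evaluate. The sketch below assumes the conditions read (i) $1-(d-1)\nu-(2K-1)d\mu_B > 0$ and (ii) $[1-(d-1)\nu-(2K-1)d\mu_B]^2 / [1-(d-1)\nu-(K-1)d\mu_B] > \omega/x_{b,\min}$; the function name `theorem1_conditions` and the numerical values in the usage note are hypothetical.

```python
def theorem1_conditions(d, K, nu, mu_B, omega, x_b_min):
    """Evaluate the two sufficient conditions of Theorem 1.

    Returns (cond_i, cond_ii). cond_ii is reported as False whenever
    cond_i fails, because both conditions are required together.
    """
    slack = 1.0 - (d - 1) * nu - (2 * K - 1) * d * mu_B   # L.H.S. of (i)
    denom = 1.0 - (d - 1) * nu - (K - 1) * d * mu_B       # exceeds slack by K*d*mu_B
    cond_i = slack > 0
    cond_ii = cond_i and (slack ** 2 / denom > omega / x_b_min)
    return cond_i, cond_ii
```

As a worked instance with assumed values $d = 4$, $K = 2$, $\nu = 0.01$, $\mu_B = 0.02$: the slack is $0.73$, the denominator is $0.89$, and the left-hand side of (ii) is about $0.599$, so a noise-to-signal ratio $\omega / x_{b,\min} = 0.1$ satisfies both conditions while a ratio of $0.7$ violates (ii).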



IV. Discussions 

We note that in this paper, as in [11], block-sparsity is 
explicitly exploited to yield a more relaxed condition imposed 
on the measurement matrix, and therefore lead to a guaranteed 
recovery for a potentially higher sparsity level. If the block-sparse
signal is treated as a conventional $Kd$-sparse vector
without exploiting knowledge of the block-sparsity structure,
sufficient conditions for exact sparsity pattern recovery using
OMP are given in [5, Theorem 18] and can be formulated as
(by combining the first and the third equation in [5, Theorem 18])

$$\begin{aligned}
&\text{(i)} \quad 1 - 2Kd\mu > 0 \\
&\text{(ii)} \quad \frac{(1 - 2Kd\mu)^2}{1 - Kd\mu} > \frac{\|A_z^T \tilde{w}\|_{\infty}}{x_{\min}}
\end{aligned} \tag{24}$$

where $x_{\min}$ denotes the minimum magnitude of the nonzero
signal elements in $\tilde{x}_{nz}$. When $d = 1$, block-sparsity reduces
to conventional sparsity and we have $\nu = 0$, $\mu_B = \mu$. The
condition (22) is simplified as



$$\begin{aligned}
&\text{(i)} \quad 1 - (2K-1) d \mu > 0 \\
&\text{(ii)} \quad \frac{(1 - (2K-1) d \mu)^2}{1 - (K-1) d \mu} > \frac{\|A_z^T \tilde{w}\|_{\infty}}{x_{\min}}
\end{aligned} \tag{25}$$

which is the same as (24) except that $2K$ and $K$ in the
numerator and denominator are replaced by $2K - 1$ and $K - 1$,
respectively (it can be easily verified that (25) is slightly looser
than (24)). When $d > 1$, in the special case that the columns
of $A_l$ are orthonormal for each $l$, we have $\nu = 0$ and therefore
the recovery condition (22) becomes



$$\begin{aligned}
&\text{(i)} \quad 1 - (2K-1) d \mu_B > 0 \\
&\text{(ii)} \quad \frac{[1 - (2K-1) d \mu_B]^2}{1 - (K-1) d \mu_B} > \frac{\omega}{x_{b,\min}}
\end{aligned} \tag{26}$$

This recovery condition, (26), is less restrictive than (24) since
we have

$$\frac{[1 - (2K-1) d \mu_B]^2}{1 - (K-1) d \mu_B} \geq \frac{(1 - 2Kd\mu_B)^2}{1 - Kd\mu_B} \overset{(a)}{\geq} \frac{(1 - 2Kd\mu)^2}{1 - Kd\mu} > \frac{\|A_z^T \tilde{w}\|_{\infty}}{x_{\min}} \overset{(b)}{\geq} \frac{\omega}{x_{b,\min}} \tag{27}$$

where (a) comes from the fact that $1 - 2Kd\mu > 0$ and $\mu_B \leq \mu$
[11, Proposition 2], and (b) follows from $\omega \leq \sqrt{d} \, \|A_z^T \tilde{w}\|_{\infty}$ and
$x_{b,\min} \geq \sqrt{d} \, x_{\min}$. We see that through exploiting the block-sparsity,
the sparsity pattern recovery condition is relaxed and
we can guarantee a recovery of sparsity pattern with a higher
sparsity level. A close examination of (27) reveals that this
improvement comes from two aspects. First, the measurement
matrix requires a less restrictive mutual coherence condition
since $\mu_B \leq \mu$. Second, for the same signal, noise, and measurement
matrix, the quantity $\omega / x_{b,\min}$ is always smaller than
or equal to $\|A_z^T \tilde{w}\|_{\infty} / x_{\min}$, meaning that exploiting block-sparsity
can improve the ability of detecting weak signals
buried in noise.
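To make the coherence part of this comparison concrete, one can compute the largest block-sparsity level $K$ for which the first (noise-independent) condition still holds, once for the conventional treatment, $1 - 2Kd\mu > 0$, and once for the block-aware treatment with orthonormal blocks, $1 - (2K-1)d\mu_B > 0$. The sketch below is illustrative only: the function name and the coherence values in the usage note are assumptions, and the noise-dependent second conditions are deliberately ignored.

```python
def largest_guaranteed_K(d, coherence, block_aware):
    """Largest K satisfying the first sufficient condition.

    block_aware=False checks 1 - 2*K*d*coherence > 0 (conventional OMP);
    block_aware=True  checks 1 - (2*K - 1)*d*coherence > 0 (BOMP with
    orthonormal blocks, i.e. zero sub-coherence).
    """
    if d * coherence <= 0:
        raise ValueError("d and coherence must be positive")
    K = 0
    while True:
        nxt = K + 1
        factor = (2 * nxt - 1) if block_aware else 2 * nxt
        if 1.0 - factor * d * coherence > 0:
            K = nxt
        else:
            return K
```

For instance, with $d = 4$, an assumed $\mu = 0.03$ gives $K = 4$ for the conventional condition, whereas an assumed $\mu_B = 0.02 \leq \mu$ gives $K = 6$ for the block-aware condition.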

If the individual blocks $A_l$ are, however, not orthonormal,
then $\nu > 0$, and $\nu$ has to be small in order to result in
a performance gain for block-sparsity recovery as compared
with the conventional sparse recovery. We can also follow
the orthogonalization approach [11] to analyze the general
non-orthonormal case. We orthogonalize the individual blocks as
$A_l = \bar{A}_l V_l$, in which $\bar{A}_l$ consists of orthonormal columns,
and $V_l$ is an invertible matrix. The original dictionary can
therefore be written as $A = \bar{A} V$, where $V$ is a block-diagonal
matrix with blocks $V_l$. Clearly, orthogonalization preserves
the block-sparsity level. The comparison that is meaningful
here is between the recovery based on the original model
without exploiting block-sparsity and the recovery based on
the orthogonalized model taking block-sparsity into account.
For the orthogonalized dictionary $\bar{A}$, we have $\nu(\bar{A}) = 0$.
Therefore we are only concerned about the relation between $\mu$
before orthogonalization and $\mu_B$ after orthogonalization, which
are denoted by $\mu(A)$ and $\mu_B(\bar{A})$ respectively. Although an
exact relation between $\mu(A)$ and $\mu_B(\bar{A})$ is difficult to derive,
it has been shown in [11] that if $d > RL/(L - R)$, then we
have $\mu(A) > \mu_B(\bar{A})$. Hence even for general dictionaries,
exploiting block-sparsity still leads to a guaranteed sparsity
pattern recovery for a potentially higher sparsity level by
properly choosing the number of measurements to satisfy
$d > RL/(L - R)$.

We explore the connection and difference between our work 
and [19], [20]. In [19], [20], the problem of simultaneous 
sparse approximation has been extensively studied and many 
interesting and elegant results were obtained under different 
performance metrics. Among them, the result most related to 
our work is [19, Theorem 5.3], which presents a sufficient 
condition for simultaneous sparse pattern recovery. The dif- 
ference between our work and [19], [20] lies in two aspects. 
First, the problem considered in this paper is more general 
than that of [19], [20] since simultaneous sparse approximation 
is a special form of block-sparse signal recovery with the 
measurement matrix having a block-diagonal structure and 
identical diagonal blocks. Second, block-sparsity is exploited
in our paper to guarantee recovery at a
higher sparsity level, whereas in [19], [20], the simultaneous
sparse approximation does not lead to a more relaxed condition 
on the dictionary as compared with the conventional single 
vector sparse approximation. 

V. Numerical Results 

We present numerical results to illustrate the sparsity pattern
recovery performance of the BOMP algorithm. In the simulations,
the dictionary is randomly generated with each entry
independently drawn from a Gaussian distribution with zero
mean and unit variance. We then normalize each column of the
dictionary to satisfy the unit-norm constraint. The dictionary
is divided into consecutive blocks of length $d$. The support
set of the block-sparse signal is randomly chosen according
to a uniform distribution, and the signals on the support set
are i.i.d. Gaussian random variables with zero mean and unit
variance. The measurement noise vector is randomly generated
with each entry drawn from a Gaussian distribution with zero
mean and variance $\sigma^2$.

To show the effectiveness of the BOMP algorithm, we
compare it with the OMP algorithm that does not take block-sparsity
into account. Fig. 1 shows the sparsity pattern recovery
success rate as a function of the block-sparsity level, $K$.
The sparsity pattern recovery is considered successful only
if the algorithm determines all the correct support indices in
the first $K$ steps for the BOMP or in the first $Kd$ steps for
the OMP, supposing the block-sparsity level, $K$, is known
a priori. The results are averaged over 1000 Monte Carlo
runs, with the dictionary, the signal, and the noise randomly
generated for each run. From Fig. 1 we observe that for both
the BOMP and the OMP algorithms, the success rate decreases
as the block-sparsity level, $K$, increases. Also, it can be seen
that the BOMP algorithm presents a significant performance
improvement over the OMP. The results corroborate our theoretical
claim that exploiting block-sparsity can lead to an
improved recovery ability. Fig. 2 depicts the success rate of
the BOMP algorithm under different noise power levels. We
see that as the noise power increases, the recovery performance
degrades. This observation is quite intuitive and coincides with
our theoretical result since a higher noise power calls for a
stricter requirement on the measurement matrix in order to
satisfy the condition (22).

[Figure 1: horizontal axis: block-sparsity level (K); legend: BOMP ($\sigma^2 = 0.01$), OMP ($\sigma^2 = 0.01$), BOMP ($\sigma^2 = 0.05$), OMP ($\sigma^2 = 0.05$).]
Fig. 1. Sparsity pattern recovery success rates of OMP and BOMP algorithms
vs. block-sparsity level, m = 40, n = 400, d = 4, and L = 100.

[Figure 2: horizontal axis: block-sparsity level (K); legend: BOMP ($\sigma^2 = 0.2$), BOMP ($\sigma^2 = 0.5$).]
Fig. 2. Sparsity pattern recovery success rate of BOMP algorithm vs. block-sparsity
level, m = 40, n = 400, d = 4, and L = 100.
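The simulation protocol described in this section can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the authors' code: the function names, the default of 200 trials, and the random seed are choices made here, and conventional OMP is obtained by running the same greedy routine with block size 1 for $Kd$ iterations.

```python
import numpy as np

def greedy_blocks(A, y, d, n_iter):
    """Run n_iter greedy block selections (d = 1 gives conventional OMP)."""
    L = A.shape[1] // d
    r, S = y.copy(), []
    for _ in range(n_iter):
        corr = np.linalg.norm((A.T @ r).reshape(L, d), axis=1)
        if S:
            corr[S] = -np.inf                    # never pick a block twice
        S.append(int(np.argmax(corr)))
        cols = np.concatenate([np.arange(i * d, (i + 1) * d) for i in S])
        coef, *_ = np.linalg.lstsq(A[:, cols], y, rcond=None)
        r = y - A[:, cols] @ coef                # residual after least squares
    return S

def success_rates(K, sigma2, m=40, L=100, d=4, trials=200, seed=0):
    """Monte Carlo estimate of (BOMP, OMP) support recovery success rates."""
    rng = np.random.default_rng(seed)
    n = L * d
    hits_bomp = hits_omp = 0
    for _ in range(trials):
        A = rng.standard_normal((m, n))
        A /= np.linalg.norm(A, axis=0)           # unit-norm columns
        blocks = rng.choice(L, size=K, replace=False)
        cols = np.concatenate([np.arange(b * d, (b + 1) * d) for b in blocks])
        x = np.zeros(n)
        x[cols] = rng.standard_normal(K * d)     # i.i.d. N(0, 1) on the support
        y = A @ x + np.sqrt(sigma2) * rng.standard_normal(m)
        # success: exactly the true support is found in K (or K*d) steps
        hits_bomp += set(greedy_blocks(A, y, d, K)) == set(blocks.tolist())
        hits_omp += set(greedy_blocks(A, y, 1, K * d)) == set(cols.tolist())
    return hits_bomp / trials, hits_omp / trials
```

A call such as `success_rates(K=3, sigma2=0.01)` returns one pair of empirical success rates; sweeping `K` and `sigma2` would produce curves of the kind summarized in the two figures, although the exact values depend on the seed and the number of trials.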

VI. Conclusion 

We studied the problem of recovering the sparsity pattern 
of block-sparse signals from noise-corrupted measurements. 
Our results showed that even in the presence of noise, the 
block-sparsity pattern can still be completely recovered via a 
block-version of the OMP algorithm when certain conditions 
are satisfied. Also, our analysis revealed that exploiting block-sparsity
can lead to a guaranteed recovery for a potentially
higher sparsity level. This theoretical claim was also corroborated
by our numerical results.

Appendix A 
Proof of Theorem 1

To prove Theorem 1, we only need to prove that (19) holds
for $0 \leq k \leq K - 1$ given that the condition (22) is satisfied. To this
goal, we first derive an upper bound on the second term on
the left-hand side (L.H.S.) of (19).

The numerator of the second term on the L.H.S. of (19) is
upper bounded by

$$\|A_z^T \tilde{w}\|_{2,\infty} \leq \omega \tag{28}$$



To derive an upper bound on the second term on the L.H.S.
of (19), we need to obtain a lower bound on its denominator
in terms of the block coherence parameter $\mu_B$ and the sub-coherence
parameter $\nu$. We have

$$\begin{aligned}
\|A_{nz}^T \bar{r}_k\|_{2,\infty} &= \|A_{nz}^T (\Phi_2 \phi_2 - P_{\Phi_1} \Phi_2 \phi_2)\|_{2,\infty} \\
&\overset{(a)}{\geq} \|A_{nz}^T \Phi_2 \phi_2\|_{2,\infty} - \|A_{nz}^T P_{\Phi_1} \Phi_2 \phi_2\|_{2,\infty}
\end{aligned} \tag{29}$$

where (a) comes from the general mixed $\ell_2/\ell_p$-norm triangle
inequality. The first term on the right-hand side (R.H.S.) of
(29) can be further lower bounded as



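$$\begin{aligned}
\|A_{nz}^T \Phi_2 \phi_2\|_{2,\infty} &= \max_{i \in I_1} \|A_i^T \Phi_2 \phi_2\|_2 = \max_{i \in I_1} \Big\| \sum_{j=k+1}^{K} A_i^T A_{l_j} \tilde{x}_{l_j} \Big\|_2 \\
&\geq \max_{i \in \{l_{k+1}, \ldots, l_K\}} \Big( \|A_i^T A_i \tilde{x}_i\|_2 - \sum_{\{j \mid l_j \neq i, \, k+1 \leq j \leq K\}} \|A_i^T A_{l_j} \tilde{x}_{l_j}\|_2 \Big) \\
&\overset{(a)}{\geq} \max_{i \in \{l_{k+1}, \ldots, l_K\}} \Big( (1 - (d-1)\nu) \|\tilde{x}_i\|_2 - d\mu_B \sum_{\{j \mid l_j \neq i, \, k+1 \leq j \leq K\}} \|\tilde{x}_{l_j}\|_2 \Big) \\
&\geq (1 - (d-1)\nu - (K-k-1) d \mu_B) \max_{i \in \{l_{k+1}, \ldots, l_K\}} \|\tilde{x}_i\|_2
\end{aligned} \tag{30}$$

where (a) comes from the fact that $\lambda_{\min}(A_i^T A_i) \geq 1 - (d-1)\nu$
(this fact comes directly from the Gershgorin Circle Theorem),
and $\rho(A_i^T A_j) \leq d\mu_B$ for $i \neq j$. On the other hand, the second
term on the R.H.S. of (29) can be upper bounded by (please
see Appendix B for the detailed derivation)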

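$$\|A_{nz}^T P_{\Phi_1} \Phi_2 \phi_2\|_{2,\infty} \leq d\mu_B (K-k) \max_{i \in \{l_{k+1}, \ldots, l_K\}} \|\tilde{x}_i\|_2 \tag{31}$$

Combining (29)-(31), the L.H.S. of (29) is further lower bounded by

$$\begin{aligned}
\|A_{nz}^T \bar{r}_k\|_{2,\infty} &\geq (1 - (d-1)\nu - (2K-2k-1) d \mu_B) \max_{i \in \{l_{k+1}, \ldots, l_K\}} \|\tilde{x}_i\|_2 \\
&\geq (1 - (d-1)\nu - (2K-2k-1) d \mu_B) \, x_{b,\min}
\end{aligned} \tag{32}$$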
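The lower bound above relies on the Gershgorin-type fact that the smallest eigenvalue of each within-block Gram matrix is at least $1 - (d-1)\nu$. This can be spot-checked numerically on any dictionary with unit-norm columns; the sketch below assumes NumPy, and the helper name `gershgorin_check` is hypothetical.

```python
import numpy as np

def gershgorin_check(A, d):
    """Return (smallest eigenvalue over all within-block Gram matrices,
    the Gershgorin lower bound 1 - (d - 1) * nu)."""
    L = A.shape[1] // d
    nu, lam_min = 0.0, np.inf
    for i in range(L):
        block = A[:, i * d:(i + 1) * d]
        G = block.T @ block                       # within-block Gram matrix
        # sub-coherence: largest off-diagonal magnitude over all blocks
        nu = max(nu, np.abs(G - np.diag(np.diag(G))).max())
        lam_min = min(lam_min, np.linalg.eigvalsh(G).min())
    return lam_min, 1.0 - (d - 1) * nu
```

For any input with unit-norm columns the first returned value should be no smaller than the second, up to round-off; for orthonormal blocks both equal one.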

Since (18) is a necessary condition for (19), we should always
have $1 - (d-1)\nu - (2K-1) d \mu_B > 0$. Therefore we can
guarantee that the above derived lower bound is positive.
Consequently an upper bound on the second term on the
L.H.S. of (19) can be derived and given as

$$\begin{aligned}
\frac{\|A_z^T \tilde{w}\|_{2,\infty}}{\|A_{nz}^T \bar{r}_k\|_{2,\infty}} &\leq \frac{\omega}{(1 - (d-1)\nu - (2K-2k-1) d \mu_B) \, x_{b,\min}} \\
&\leq \frac{\omega}{(1 - (d-1)\nu - (2K-1) d \mu_B) \, x_{b,\min}}
\end{aligned} \tag{33}$$

We see that the first and the second term on the L.H.S. of (19)
are respectively upper bounded by (17) and (33). Therefore
(19) is guaranteed if the sum of these two upper bounds
is smaller than unity, i.e.

$$\frac{K d \mu_B}{1 - (d-1)\nu - (K-1) d \mu_B} + \frac{\omega}{(1 - (d-1)\nu - (2K-1) d \mu_B) \, x_{b,\min}} < 1 \tag{34}$$

A further transformation easily shows that (34) and (22) are
equivalent (note that the condition $1 - (d-1)\nu - (2K-1) d \mu_B > 0$
has to be explicitly indicated to assure (18) and to assure the
positiveness of the lower bound (32)). The proof is completed
here.
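The "further transformation" can be spelled out in one line. The shorthand $D_1$ and $D_2$ below is introduced here purely for illustration and does not appear in the original derivation. Let $D_1 = 1 - (d-1)\nu - (K-1) d \mu_B$ and $D_2 = 1 - (d-1)\nu - (2K-1) d \mu_B$, so that $D_1 - K d \mu_B = D_2$. Assuming $D_2 > 0$ (and hence $D_1 > 0$), the sum of the two upper bounds is below unity exactly when

$$\frac{K d \mu_B}{D_1} + \frac{\omega}{D_2 \, x_{b,\min}} < 1 \iff \frac{\omega}{D_2 \, x_{b,\min}} < \frac{D_1 - K d \mu_B}{D_1} = \frac{D_2}{D_1} \iff \frac{\omega}{x_{b,\min}} < \frac{D_2^2}{D_1}$$

which is the second condition of Theorem 1, while $D_2 > 0$ is the first.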



Appendix B 
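Derivation of Equation (31)

Clearly we have

$$\|A_{nz}^T P_{\Phi_1} \Phi_2 \phi_2\|_{2,\infty} = \max_{i \in I_1} \|A_i^T P_{\Phi_1} \Phi_2 \phi_2\|_2 \tag{35}$$

We consider two different cases. If $A_i$ is a column-block of
$\Phi_1$, i.e. $i \in \{l_1, \ldots, l_k\}$, then for any such index $i$, we have

$$\begin{aligned}
\|A_i^T P_{\Phi_1} \Phi_2 \phi_2\|_2 &\overset{(a)}{=} \|A_i^T \Phi_2 \phi_2\|_2 = \Big\| \sum_{j=k+1}^{K} A_i^T A_{l_j} \tilde{x}_{l_j} \Big\|_2 \\
&\leq \sum_{j=k+1}^{K} \|A_i^T A_{l_j} \tilde{x}_{l_j}\|_2 \leq d\mu_B \sum_{j=k+1}^{K} \|\tilde{x}_{l_j}\|_2 \\
&\leq d\mu_B (K-k) \max_{i \in \{l_{k+1}, \ldots, l_K\}} \|\tilde{x}_i\|_2
\end{aligned} \tag{36}$$

where (a) comes from the fact that $\Phi_1^T P_{\Phi_1} = \Phi_1^T \Phi_1 \Phi_1^{\dagger} = \Phi_1^T$,
and therefore $A_i^T P_{\Phi_1} = A_i^T$ for
$i \in \{l_1, \ldots, l_k\}$. On the other hand, if $A_i$ is a column-block
of $\Phi_2$, i.e. $i \in \{l_{k+1}, \ldots, l_K\}$, we show that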

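$$\max_{i \in \{l_1, \ldots, l_k\}} \|A_i^T P_{\Phi_1} \Phi_2 \phi_2\|_2 \geq \max_{i \in \{l_{k+1}, \ldots, l_K\}} \|A_i^T P_{\Phi_1} \Phi_2 \phi_2\|_2 \tag{37}$$

To this goal, let $z = \Phi_1^{\dagger} \Phi_2 \phi_2 = [z_1^T \; \cdots \; z_k^T]^T$, so that
$P_{\Phi_1} \Phi_2 \phi_2 = \Phi_1 z$. The term on the L.H.S. of (37) is lower bounded as

$$\begin{aligned}
\max_{i \in \{l_1, \ldots, l_k\}} \|A_i^T P_{\Phi_1} \Phi_2 \phi_2\|_2 &= \max_{i \in \{l_1, \ldots, l_k\}} \|A_i^T \Phi_1 z\|_2 = \max_{i \in \{l_1, \ldots, l_k\}} \Big\| \sum_{j=1}^{k} A_i^T A_{l_j} z_j \Big\|_2 \\
&\overset{(a)}{\geq} \|A_{l_g}^T A_{l_g} z_g\|_2 - \sum_{\{j \mid j \neq g, \, 1 \leq j \leq k\}} \|A_{l_g}^T A_{l_j} z_j\|_2 \\
&\geq (1 - (d-1)\nu) \|z_g\|_2 - d\mu_B \sum_{\{j \mid j \neq g, \, 1 \leq j \leq k\}} \|z_j\|_2 \\
&\geq (1 - (d-1)\nu - (k-1) d \mu_B) \|z_g\|_2
\end{aligned} \tag{38}$$

where in (a), the index $g$ is chosen such that $z_g$ has the
maximum $\ell_2$-norm among $\{z_j\}_{j=1}^{k}$. The term on the R.H.S.
of (37) is upper bounded by

$$\begin{aligned}
\max_{i \in \{l_{k+1}, \ldots, l_K\}} \|A_i^T P_{\Phi_1} \Phi_2 \phi_2\|_2 &= \max_{i \in \{l_{k+1}, \ldots, l_K\}} \|A_i^T \Phi_1 z\|_2 = \max_{i \in \{l_{k+1}, \ldots, l_K\}} \Big\| \sum_{j=1}^{k} A_i^T A_{l_j} z_j \Big\|_2 \\
&\leq \max_{i \in \{l_{k+1}, \ldots, l_K\}} \sum_{j=1}^{k} \|A_i^T A_{l_j} z_j\|_2 \leq k d \mu_B \|z_g\|_2
\end{aligned} \tag{39}$$

Since we have $1 - (d-1)\nu - (2K-1) d \mu_B > 0$ in order to
assure the condition (18) to be satisfied, we can easily verify
that the following always holds for $0 \leq k < K$

$$1 - (d-1)\nu - (k-1) d \mu_B > k d \mu_B \tag{40}$$

The inequality (37) comes directly by combining (38)-(40).
Therefore the second term on the R.H.S. of (29) is upper
bounded by

$$\begin{aligned}
\|A_{nz}^T P_{\Phi_1} \Phi_2 \phi_2\|_{2,\infty} &= \max_{i \in I_1} \|A_i^T P_{\Phi_1} \Phi_2 \phi_2\|_2 \\
&= \max_{i \in \{l_1, \ldots, l_k\}} \|A_i^T P_{\Phi_1} \Phi_2 \phi_2\|_2 \\
&\leq d\mu_B (K-k) \max_{i \in \{l_{k+1}, \ldots, l_K\}} \|\tilde{x}_i\|_2
\end{aligned} \tag{41}$$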